Weakly supervised salient object detection algorithm based on bounding box annotation
Qiang WANG, Xiaoming HUANG, Qiang TONG, Xiulei LIU
Journal of Computer Applications    2023, 43 (6): 1910-1918.   DOI: 10.11772/j.issn.1001-9081.2022050706

To address the inaccurate localization of salient objects in previous weakly supervised salient object detection algorithms, a weakly supervised salient object detection algorithm based on bounding box annotation was proposed. In the proposed algorithm, the minimum bounding rectangles of all objects in the image were adopted as supervision information. Firstly, the initial saliency map was generated based on the bounding box annotation and the GrabCut algorithm. Then, a correction module for missing objects was designed to obtain the optimized saliency map. Finally, combining the advantages of traditional methods and deep learning methods, the optimized saliency map was used as the pseudo ground truth to train a salient object detection model with a neural network. The proposed algorithm was compared with six unsupervised and four weakly supervised saliency detection algorithms on four public datasets. Experimental results show that the proposed algorithm significantly outperforms the comparison algorithms in both Max F-measure (Max-F) and Mean Absolute Error (MAE) on all four datasets. Compared with SBB (Saliency Bounding Boxes), which is also a weakly supervised method based on bounding box annotation, the annotation method of the proposed algorithm is simpler. On the four datasets ECSSD, DUTS-TE, HKU-IS and DUT-OMRON, the Max-F increased by 1.82%, 4.00%, 1.27% and 5.33% respectively, and the MAE decreased by 13.89%, 15.07%, 8.77% and 13.33% respectively. These results indicate that the proposed algorithm is a weakly supervised salient object detection algorithm with good detection performance.
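The two metrics named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the weight beta_sq = 0.3 and the sweep of 256 thresholds are common conventions in salient object detection and are assumptions here, as are the function names.

```python
def mean_absolute_error(saliency, mask):
    """MAE between a saliency map in [0, 1] and a binary ground-truth mask."""
    total = sum(abs(s - m)
                for row_s, row_m in zip(saliency, mask)
                for s, m in zip(row_s, row_m))
    count = sum(len(row) for row in mask)
    return total / count


def max_f_measure(saliency, mask, beta_sq=0.3, steps=256):
    """Largest F-measure over a sweep of binarization thresholds.

    beta_sq and steps are assumed defaults, not values from the abstract.
    """
    flat_s = [s for row in saliency for s in row]
    flat_m = [m for row in mask for m in row]
    positives = sum(flat_m)
    best = 0.0
    for k in range(steps):
        threshold = k / (steps - 1)
        predicted = [1 if s >= threshold else 0 for s in flat_s]
        tp = sum(p and m for p, m in zip(predicted, flat_m))
        if tp == 0:
            continue  # precision and recall are both zero at this threshold
        precision = tp / sum(predicted)
        recall = tp / positives
        f = (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)
        best = max(best, f)
    return best
```

Both functions take nested lists of equal shape; a lower MAE and a higher Max-F indicate a better saliency map.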

Parking space detection method based on self-supervised learning HOG prediction auxiliary task
Lei LIU, Peng WU, Kai XIE, Beizhi CHENG, Guanqun SHENG
Journal of Computer Applications    2023, 43 (12): 3933-3940.   DOI: 10.11772/j.issn.1001-9081.2022111687

In intelligent parking space management systems, factors such as illumination changes and parking space occlusion can reduce the accuracy and effectiveness of parking space prediction. To overcome this problem, a parking space detection method based on a self-supervised learning HOG (Histogram of Oriented Gradients) prediction auxiliary task was proposed. Firstly, a self-supervised auxiliary task was designed to predict the HOG features of the occluded part of the image. By using the MobileViTBlock (light-weight, general-purpose, and Mobile-friendly Vision Transformer Block) to synthesize the global information of the image, the visual representation of the image was learned more fully and the feature extraction ability of the model was improved. Then, the SE (Squeeze-and-Excitation) attention mechanism was improved, enabling the model to match or even exceed the effect of the original SE attention mechanism at a lower computational cost. Finally, the feature extraction part trained by the auxiliary task was applied to the downstream classification task for parking space status prediction. Experiments were carried out on a mixed dataset of PKLot and CNRPark. The experimental results show that the proposed model reaches an accuracy of 97.49% on the test set; compared with RepVGG, the accuracy of occlusion prediction improves by 5.46 percentage points, which represents a great improvement over other parking space detection algorithms.
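To make the HOG target of the auxiliary task concrete, the sketch below computes an orientation histogram for a single grayscale cell. It is a simplified illustration under stated assumptions: 9 unsigned-orientation bins, central-difference gradients on interior pixels only, hard bin assignment, and no block normalization. The function name is hypothetical and the abstract does not specify the authors' HOG settings.

```python
import math


def hog_cell_histogram(cell, n_bins=9):
    """Gradient-orientation histogram for one grayscale cell (nested lists).

    Simplified: interior pixels only, hard binning, no block normalization.
    """
    height, width = len(cell), len(cell[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]   # horizontal gradient
            gy = cell[y + 1][x] - cell[y - 1][x]   # vertical gradient
            magnitude = math.hypot(gx, gy)
            # Fold the angle into [0, 180) so opposite directions share a bin.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist
```

A cell containing a vertical edge puts all of its gradient magnitude into the first bin, since the gradient points horizontally.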

Clustering based on discrete hashing
Shuting XUAN, Jinglei LIU
Journal of Computer Applications    2022, 42 (3): 713-723.   DOI: 10.11772/j.issn.1001-9081.2021040911

Traditional clustering methods are carried out in the data space, and the clustered data is high-dimensional. To solve these two problems, a new binary image clustering method, Clustering based on Discrete Hashing (CDH), was proposed. To reduce the dimension of the data, the L2,1-norm was used in this framework to realize adaptive feature selection. At the same time, the data was mapped into binary Hamming space by the hashing method. Then, the sparse binary matrix was decomposed into a low-rank matrix in the Hamming space to complete fast image clustering. Finally, an optimization scheme that converges quickly was used to solve the objective function. Experimental results on image datasets (Caltech101, Yale, COIL20, ORL) show that this method can effectively improve the efficiency of clustering. Compared with traditional clustering methods such as K-means and Spectral Clustering (SC), the time efficiency of CDH was improved by 87 and 98 percentage points respectively in the Gabor view of the Caltech101 dataset when processing high-dimensional data.
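Two building blocks mentioned in this abstract, the L2,1-norm and distance in Hamming space, can be illustrated briefly. The sketch assumes the row-wise convention for the L2,1-norm (the sum of the Euclidean norms of the rows), which is the usual choice for feature selection; the abstract does not state which convention the paper uses, and the function names are hypothetical.

```python
import math


def l21_norm(matrix):
    """Sum of the Euclidean norms of the rows (row-wise convention assumed)."""
    return sum(math.sqrt(sum(v * v for v in row)) for row in matrix)


def hamming_distance(code_a, code_b):
    """Number of positions at which two equal-length binary codes differ."""
    if len(code_a) != len(code_b):
        raise ValueError("binary codes must have the same length")
    return sum(a != b for a, b in zip(code_a, code_b))
```

Because the L2,1-norm penalizes whole rows, minimizing it tends to drive entire rows to zero, which is why it is used to select features. Hamming distance between binary codes reduces to counting differing bits, which is what makes clustering in Hamming space fast.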

Short-term trajectory prediction model of aircraft based on attention mechanism and generative adversarial network
Yuli CHEN, Qiang TONG, Tongtong CHEN, Shoulu HOU, Xiulei LIU
Journal of Computer Applications    2022, 42 (10): 3292-3299.   DOI: 10.11772/j.issn.1001-9081.2021081387

A single Long Short-Term Memory (LSTM) network can neither effectively extract key information nor accurately fit the data distribution in trajectory prediction. To solve these problems, a short-term trajectory prediction model of aircraft based on attention mechanism and Generative Adversarial Network (GAN) was proposed. Firstly, different weights were assigned to the trajectory by introducing an attention mechanism, so that the influence of important features in the trajectory was strengthened. Secondly, the trajectory sequence features were extracted by using LSTM, and the convergence net was used to gather all aircraft features within the time step. Finally, the continuous optimization of GAN through its adversarial game was used to optimize the model and improve its accuracy. Compared with Social Generative Adversarial Network (SGAN), the proposed model reduces the Average Displacement Error (ADE), Final Displacement Error (FDE) and Maximum Displacement Error (MDE) by 20.0%, 20.4% and 18.3% respectively on the climb-phase dataset. Experimental results show that the proposed model can predict future trajectories more accurately.
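The three displacement errors reported here have simple definitions, sketched below for one predicted trajectory. The function name and the dictionary layout are hypothetical; points may have any dimension (for example latitude, longitude, altitude) as long as both trajectories use the same one and the same units.

```python
import math


def displacement_errors(predicted, actual):
    """ADE, FDE and MDE between two trajectories of equal length.

    Each trajectory is a sequence of points, one per time step.
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("trajectories must be non-empty and equally long")
    distances = [math.dist(p, a) for p, a in zip(predicted, actual)]
    return {
        "ADE": sum(distances) / len(distances),  # mean over all time steps
        "FDE": distances[-1],                    # error at the final step
        "MDE": max(distances),                   # worst single-step error
    }
```

ADE summarizes overall fit, FDE shows how far the prediction has drifted by the end of the horizon, and MDE exposes the worst point along the way.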

Three-stage question answering model based on BERT
Yu PENG, Xiaoyu LI, Shijie HU, Xiaolei LIU, Weizhong QIAN
Journal of Computer Applications    2022, 42 (1): 64-70.   DOI: 10.11772/j.issn.1001-9081.2021020335

The development of pre-trained language models has greatly promoted the progress of machine reading comprehension tasks. In order to make full use of the shallow features of the pre-trained language model and further improve the accuracy of the answers predicted by the question answering model, a three-stage question answering model based on Bidirectional Encoder Representations from Transformers (BERT) was proposed. Firstly, three stages of pre-answering, re-answering and answer-adjusting were designed based on BERT. Secondly, the inputs of the embedding layer of BERT were treated as shallow features to pre-generate an answer in the pre-answering stage. Then, the deep features fully encoded by BERT were used to generate another answer in the re-answering stage. Finally, the final prediction was generated by combining the previous two answers in the answer-adjusting stage. Experimental results on the span-extraction question answering task, using the English dataset Stanford Question Answering Dataset 2.0 (SQuAD2.0) and the Chinese dataset Chinese Machine Reading Comprehension 2018 (CMRC2018), show that the Exact Match (EM) and F1 score (F1) of the proposed model are improved by 1 to 3 percentage points on average compared with those of similar baseline models, and that the extracted answer spans are more accurate. By combining the shallow features of BERT with its deep features, this three-stage model extends the abstract representation ability of BERT, explores the application of shallow features of BERT in question answering models, and has a simple structure, accurate prediction, and fast training and inference.
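For readers unfamiliar with the two metrics, the sketch below shows how Exact Match and token-level F1 are typically computed for a single predicted span. It is deliberately simplified and is an assumption, not the official scoring script: it lowercases and splits on whitespace only, whereas the official SQuAD evaluation also strips punctuation and articles, and Chinese datasets such as CMRC2018 are usually scored at the character level.

```python
from collections import Counter


def exact_match(prediction, reference):
    """1.0 if the two answers match after trimming and lowercasing, else 0.0."""
    return float(prediction.strip().lower() == reference.strip().lower())


def token_f1(prediction, reference):
    """Harmonic mean of token precision and recall (whitespace tokens)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

EM rewards only a perfect span, while F1 gives partial credit when the predicted span overlaps the reference.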

Moving object detection algorithm of improved Gaussian mixture model
HUA Yuanlei, LIU Wanjun
Journal of Computer Applications    2014, 34 (2): 580-584.  
Because the traditional Gaussian mixture model cannot detect complete moving objects and is prone to detecting background as foreground, an improved moving object detection algorithm based on the Gaussian mixture model was proposed. The Gaussian background model was integrated with an improved frame difference method to distinguish the uncovered background area from the moving object region, so that the complete moving object could be extracted. A larger background update rate was given to the uncovered background area to eliminate the influence of the exposed background region. In complex scenes, a background model replacement method was used to improve the stability of the algorithm. The experiments show that the improved algorithm is considerably better in adaptability, accuracy, real-time performance and practicality, and can correctly and effectively detect moving objects in situations with various complicating factors.
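The idea of combining a background model with a frame difference, and of updating uncovered background faster, can be illustrated with a much simpler model than the one in the paper. The sketch below swaps the Gaussian mixture for a single running-average background per pixel, and its rule for spotting uncovered background (differs from the model but is unchanged since the previous frame) is an assumption made for illustration. The function name, threshold, and both update rates are hypothetical.

```python
def update_background(background, frame, previous, threshold=25,
                      slow_rate=0.01, fast_rate=0.5):
    """Return (new_background, moving_mask) for one grayscale frame.

    Running-average stand-in for a Gaussian mixture model; all inputs
    are nested lists of the same shape.
    """
    new_background, moving_mask = [], []
    for bg_row, cur_row, prev_row in zip(background, frame, previous):
        bg_out, mask_out = [], []
        for bg, cur, prev in zip(bg_row, cur_row, prev_row):
            differs_from_model = abs(cur - bg) > threshold
            changed_since_last = abs(cur - prev) > threshold
            moving = differs_from_model and changed_since_last
            # Assumed rule: a pixel that disagrees with the model but is
            # static between frames is treated as uncovered background.
            uncovered = differs_from_model and not changed_since_last
            if moving:
                bg_out.append(bg)  # do not learn from moving pixels
            else:
                rate = fast_rate if uncovered else slow_rate
                bg_out.append((1 - rate) * bg + rate * cur)
            mask_out.append(1 if moving else 0)
        new_background.append(bg_out)
        moving_mask.append(mask_out)
    return new_background, moving_mask
```

The larger rate lets the model absorb newly exposed background within a few frames instead of reporting it as foreground for a long time. A known limitation of this simplified rule is that an object that stops moving is absorbed just as quickly.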
Improved tone modeling by exploiting articulatory features for Mandarin speech recognition
CHAO Hao, YANG Zhanlei, LIU Wenju
Journal of Computer Applications    2013, 33 (10): 2939-2944.  
Articulatory features, which represent the articulatory information, can help prosodic features to improve the performance of tone recognition. In this paper, a set of 19 pronunciation categories was defined according to the pronunciation characteristics of initials and finals. In addition, 19 articulatory tandem features, which are the posterior probabilities of the speech signal belonging to the 19 pronunciation categories, were obtained by hierarchical multilayer perceptron classifiers. Then these articulatory tandem features, together with prosodic features, were used for tone modeling. Tone recognition experiments with three kinds of tone models indicate that an absolute accuracy increase of about 5% can be achieved when using both articulatory features and prosodic features. When the proposed tone model is integrated into an LVCSR (Large Vocabulary Continuous Speech Recognition) system, the character error rate is reduced significantly.
Tasks assignment optimization in Hadoop
HUANG Chengzhen, WANG Lei, LIU Xiaolong, KUANG Yaping
Journal of Computer Applications    2013, 33 (08): 2158-2162.  
Hadoop has been widely used in parallel processing of large data. Most existing task assignment strategies are oriented to a homogeneous environment and ignore the global cluster state, or fail to take into account the implementation efficiency and algorithm complexity in a heterogeneous environment. To solve these problems, a new task assignment algorithm named λ-Flow, oriented to a heterogeneous environment, was proposed. In λ-Flow, task assignment was divided into several rounds. In each round, λ-Flow dynamically collected the cluster states and the execution results of the previous round, and assigned tasks according to them. The comparative experimental results show that the λ-Flow algorithm performs better in a dynamically changing cluster than the existing algorithms, and effectively reduces the execution time of a job.
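The round-based idea can be sketched with a generic greedy scheduler. This is not the λ-Flow algorithm, whose details the abstract does not give; it only illustrates how one round might use the node speeds and backlogs observed after the previous round. All names, the cost model (work units divided by speed), and the largest-task-first ordering are assumptions.

```python
def assign_round(task_costs, node_speeds, node_backlog):
    """Assign one round of tasks, each to the node that would finish it first.

    task_costs:   {task: work units}
    node_speeds:  {node: work units per second, as measured last round}
    node_backlog: {node: seconds of work already queued}
    Returns (plan, updated_backlog).
    """
    backlog = dict(node_backlog)
    plan = {}
    # Place the largest tasks first so small ones can fill the gaps.
    for task, cost in sorted(task_costs.items(), key=lambda kv: -kv[1]):
        best = min(backlog, key=lambda n: backlog[n] + cost / node_speeds[n])
        backlog[best] += cost / node_speeds[best]
        plan[task] = best
    return plan, backlog
```

Calling the function once per round with freshly measured speeds lets the assignment follow a cluster whose node performance changes over time; the returned backlog can be fed into the next round.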
Improved syllable-based acoustic modeling for continuous Chinese speech recognition
CHAO Hao, YANG Zhanlei, LIU Wenju
Journal of Computer Applications    2013, 33 (06): 1742-1745.   DOI: 10.3724/SP.J.1087.2013.01742
To address the variability of the speech signal caused by co-articulation in Chinese speech recognition, a syllable-based acoustic modeling method was proposed. Firstly, context-independent syllable-based acoustic models were trained, and the models were initialized by intra-syllable IFs based diphones to solve the problem of training data sparsity. Secondly, the inter-syllable co-articulation effect was captured by incorporating inter-syllable transition models into the recognition system. The experiments conducted on the “863-test” dataset show that the character error rate is reduced by 12.13% in relative terms. This indicates that the syllable-based acoustic model and the inter-syllable transition model are effective in handling the co-articulation effect.
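Since the result is stated as a relative reduction in character error rate, a short sketch of both quantities may help. It assumes the usual definition of CER as the character-level Levenshtein distance divided by the reference length, with a non-empty reference; the function names are hypothetical and the numbers in the usage note are made up for illustration.

```python
def character_error_rate(reference, hypothesis):
    """Levenshtein distance between two strings over the reference length."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1] / len(reference)


def relative_reduction(baseline_rate, improved_rate):
    """Fraction of the baseline error rate that the new system removes."""
    return (baseline_rate - improved_rate) / baseline_rate
```

For example, moving from a CER of 0.20 to 0.15 is an absolute reduction of 5 percentage points but a relative reduction of 25%, which is why the two kinds of figure should not be compared directly.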
Fast disparity estimation algorithm based on features of disparity vector
SONG Xiao-wei, YANG Lei, LIU Zhong, LIAO Liang
Journal of Computer Applications    2012, 32 (07): 1856-1859.   DOI: 10.3724/SP.J.1087.2012.01856
Disparity estimation is a key technology for stereo video compression. Considering the disadvantages of the epipolar correction algorithm, a fast disparity estimation algorithm based on the features of the disparity vector was proposed. The algorithm analyzed the features of the disparity vector in parallel camera and convergent camera systems respectively, and explained how to find the best matching block by a three-step search according to these features. The algorithm was tested on sequences at both 640×480 and 1280×720 resolution. The experimental results show that, compared with the original TZSearch algorithm in JMVC, the proposed algorithm can effectively shorten the encoding time and improve coding efficiency without decreasing the image quality or compression efficiency. Because the proposed algorithm involves no epipolar correction, the disadvantages caused by epipolar correction do not arise.
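The three-step search mentioned here can be sketched in its classic, generic form. This is not the paper's variant, which adapts the search to the features of the disparity vector in each camera setup; the sketch uses the sum of absolute differences (SAD) as the matching cost and step sizes of 4, 2 and 1, both of which are assumptions, as are the function names.

```python
def sad(target, reference, ty, tx, ry, rx, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(abs(target[ty + i][tx + j] - reference[ry + i][rx + j])
               for i in range(size) for j in range(size))


def three_step_search(target, reference, ty, tx, size, step=4):
    """Return the (dy, dx) displacement of the best-matching reference block.

    Starts at the co-located block and, at each step size, tests the
    eight neighbours of the current best position before halving the step.
    """
    height, width = len(reference), len(reference[0])
    best_y, best_x = ty, tx
    best_cost = sad(target, reference, ty, tx, best_y, best_x, size)
    while step >= 1:
        center_y, center_x = best_y, best_x
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                cy, cx = center_y + dy, center_x + dx
                if not (0 <= cy <= height - size and 0 <= cx <= width - size):
                    continue  # candidate block would leave the image
                cost = sad(target, reference, ty, tx, cy, cx, size)
                if cost < best_cost:
                    best_cost, best_y, best_x = cost, cy, cx
        step //= 2
    return best_y - ty, best_x - tx
```

The search evaluates at most 25 candidate blocks instead of every position in the window, which is where the speed-up comes from. The price is that it can settle in a local minimum when the matching cost is not smooth.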
OMR image segmentation based on mutation signal detection
MA Lei, LIU Jiang, LI Xiao-peng, CHEN Xia
Journal of Computer Applications    2012, 32 (04): 1137-1140.   DOI: 10.3724/SP.J.1087.2012.01137
To accurately position Optical Mark Recognition (OMR) images without any position information, an image segmentation approach using mutation signal detection based on wavelet transformation was proposed. Firstly, horizontal and vertical projections were computed, and then these projection functions were transformed by wavelet to detect mutation points, which better reflect the boundaries of OMR information. The algorithm's adaptability is based on a limited number of wavelet transforms and mutation signal detections. The experimental results demonstrate that the method achieves high segmentation accuracy and stability, and the mean square error of segmentation accuracy can be 0.4167 pixels. The method is efficient because the segmentation only uses the horizontal and vertical information. The algorithm is not sensitive to noise because of the statistical characteristics of the projection functions and the multi-resolution characteristics of the wavelet transformation.
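The projection and mutation-detection steps can be sketched as follows. The abstract does not name the wavelet or the number of decomposition levels, so the sketch assumes the simplest case: finest-scale, undecimated Haar detail coefficients, which are scaled differences of neighbouring samples. The function names and the threshold are hypothetical.

```python
import math


def projections(image):
    """Row sums and column sums of a 2-D image given as nested lists."""
    row_sums = [sum(row) for row in image]
    column_sums = [sum(column) for column in zip(*image)]
    return row_sums, column_sums


def mutation_points(signal, threshold):
    """Indices where the signal jumps, found from Haar detail coefficients.

    Assumes finest-scale undecimated Haar details; index i marks a jump
    between samples i - 1 and i.
    """
    details = [(signal[i + 1] - signal[i]) / math.sqrt(2)
               for i in range(len(signal) - 1)]
    return [i + 1 for i, d in enumerate(details) if abs(d) > threshold]
```

Applying `mutation_points` to the column sums gives candidate left and right boundaries of the marked regions, and applying it to the row sums gives the top and bottom boundaries. Summing along a whole row or column averages out isolated noisy pixels, which matches the noise robustness the abstract attributes to the projection functions.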